ABSTRACT

A strategy that has been initiated to respond to 
assertions that out-of-date norms distort standardized achievement 
test results involves annually updating the norms for achievement 
tests to avoid the production of inflated scores through aging norms. 
The effect of the application of normative trend data to the obtained 
test results in an urban school district was evaluated. The analysis 
included data for the Spring 1988 administration of the California 
Achievement Test (CAT), Form E, in reading (n=54,871) and mathematics 
(n=17,722). The district participated, with a number of other 
schools, in CTB/McGraw-Hill's Normative Trend Data (NTD) project, 
which involved renorming the CAT for that year. Data were then 
applied to the original results to transform the obtained frequency 
distributions for each subtest at each grade. Redistributed scores 
were then compared to the original distributions to assess the impact 
of the updated norms. The use of NTD data resulted in lowering the 
measure of achievement levels in reading and mathematics. NTD scores 
positively skewed the district's grade level achievement 
distributions. Annual national performance appeared to improve 
relative to the original standardization sample. Longitudinal 
comparisons for the district must, however, rely on the original 
standardization, and the interpretation and explanation of two sets 
of scores in a district compromise the utility of the results. An 
appendix contains 55 figures illustrating test score distributions. 
(SLD) 
Current Norms: Do They "Deflate" Test Scores? 
A Study of Normative Trend Data for an Urban School District 

ABSTRACT 



Introduction 

Recent criticism by Friends for Education of the use of standardized 
achievement test results among school districts nationwide has indicated 
that none of the 50 states reported being below the norm (Cannell, 1988). 
These allegations have fostered numerous studies and strategies among members 
of the education and testing communities to respond to these assertions 
(Koretz, 1988; Linn, 1989; and Shepard, 1989 among others). One strategy 
which has been suggested and initiated is that of annually updating the norms 
for achievement tests to avoid the phenomenon of aging norms producing 
inflated scores. The Normative Trend Data program made available by CTB/ 
McGraw-Hill for its customers is one example of this strategy; annual test 
results of participant districts are utilized to illustrate changes in 
national averages in achievement for those taking the CTB tests. 

The purpose of this study was to evaluate the effect of the 
application of normative trend data to the obtained test results in an urban 
school district. 

The study sought to respond to the following questions concerning 
the utilization of normative trend data. 

1. What effect would the use of updated norms have upon the 
district-wide achievement test results in reading and 
mathematics? 

2. Does comparing (transforming) scores with updated normative data 
'normalize' the score distributions of district-wide achievement 
results (as Friends for Education would prefer)? 



Methodology 

The methods employed included the analysis of the district's 
complete available test results file for the Spring, 1988 administration of 
the California Achievement Test (Form E) in reading (n = 54,871) and 
mathematics (n = 17,722) in selected grade levels. The district participated 
in the publisher's Normative Trend Data Project which involved the "renorming" 
of the CAT for that year. The data made available to the district 
from this effort were then applied to the original results to "transform" the 
obtained frequency distributions for each subtest at each grade. The 
redistributed scores (in local quartiles) were then compared to the original 
distributions to assess the impact of the updated norms. 
Findings 



Based upon original norms, the local quartile performance of the 
district's students in grades one through 11 fell below the national quartile 
levels in reading in all but five of the 99 comparisons. 

In mathematics, however, 21 of the 63 quartile comparisons made 
found local performances equal to or greater than original normed quartiles. 

Using NTD scores, local reading quartile performances remained below 
the norm; further, NTD scores indicated a lowering of quartile levels (from 
original norms) in 63 of the 99 comparisons. 

Applying NTD standards in mathematics, only three of the 63 
comparisons remained above the normed levels; NTD quartiles were lower than 
the original norms in all comparisons. 

Similar findings concerning the median score levels were evident in 
the graphic score distributions contained in the study's appendix. 



Conclusions 

The use of Normative Trend Data to interpret test scores resulted in 
a lowering of the measure of achievement levels in the district in reading and 
mathematics. NTD scores positively skewed the district's grade level achieve- 
ment distributions. 

Annual national performance in reading and mathematics appeared to 
have improved since the original test norming (based upon the reinterpreted 
quartile scores). 

While the use of current norms might serve to satisfy many concerns 
related to the use of standardized test results as a measure of achievement, 
the practicality of using two sets of norms for the measure of achievement in 
a school district is debatable. 
Current Norms: Do They "Deflate" Test Scores? 
A Study of Normative Trend Data for an Urban School District 

Introduction 

Recent criticism by Friends for Education of the use of standardized 
achievement test results among school districts nationwide has indicated 
that none of the 50 states reported being below the norm (Cannell, 1988). 
These allegations have fostered numerous studies and strategies among members 
of the education and testing communities to respond to these assertions 
(Koretz, 1988; Linn, 1989; and Shepard, 1989 among others). One strategy 
which has been suggested and initiated is that of annually updating the norms 
for achievement tests to avoid the phenomenon of aging norms producing 
inflated scores. The Normative Trend Data program made available by CTB/ 
McGraw-Hill for its customers is one example of this strategy; annual test 
results of participant districts are utilized to illustrate changes in 
national averages in achievement for those taking the CTB tests. 

The purpose of this study was to evaluate the effect of the 
application of normative trend data to the obtained test results in an urban 
school district. 

The study sought to respond to the following questions concerning 
the utilization of normative trend data. 

1. What effect would the use of updated norms have upon the 
district-wide achievement test results in reading and 
mathematics? 

2. Does comparing (transforming) scores with updated normative data 
'normalize' the score distributions of district-wide achievement 
results (as Friends for Education would prefer)? 
Review of the literature 

In 1988, John Jacob Cannell reported that "no state is below the 
norm at the elementary level on any of the six major nationally normed, 
commercially available tests!" This statement resulted from a study which his 
organization, Friends for Education, conducted in response to a perceived 
discrepancy between the performance of students in the West Virginia schools 
on nationally normed achievement tests and the state's citizens' performance 
on other education achievement indicators (college degrees, ACT results, and 
per capita income levels). Cannell wondered how this discrepancy could exist 
in his and, literally, most of the 49 other states in the union. Cannell 
went on to analyze the phenomenon by making the following allegations 
concerning the use and interpretation of standardized achievement tests. 
While "educators claim that the high scores reflect improved achievement 
levels, Friends for Education suspects that inaccurate initial norms and 
teaching the test may be the reasons for high scores" (Cannell, 1988). 
Subsequently, members of the education community (Phillips and Finn, 1988; and 
Stonehill, 1988) and the test and measurement community (Drahozal and Frisbie, 
1988; Lenke and Keene, 1988; Williams, 1988; and Qualls-Payne, 1988) responded 
in force. 

While most concurred with Dr. Cannell's findings, they sought to 
defend the use of norm-referenced tests and their legitimacy as a valid 
measure of school children's achievement. One recurrent theme among all 
respondents was the issue of the recency of the test norms. Koretz (1988) 
recognized this issue as a function of the test publishing cycle. "[N]orms 
become increasingly dated until a new [test] edition is introduced. Students 
are compared to a national standard that is sometimes more than half a decade 
out of date" (Koretz, 1988). "Obviously, those who compare the 1987 
performance of their pupils with that of other pupils who were tested in 1978 
(national standardization) will be using 'softer' norms and will have more 
pupils appearing to be above the national average than really are" (Drahozal 
and Frisbie, 1988). This phenomenon might be best addressed by more frequent 
renorming of tests or even annual norming of standardized achievement tests. 
Three of the four testing company respondents indicated that the provision of 
annual norms was already available or currently under development. Although 
these 'updated norms' cannot be used in place of the original standardization 
norms, "[s]uch data could be used to amplify the standardization norms and 
provide a more complete picture on the progress local school districts were 
making in their instructional efforts" (Williams, 1988). 

Methodology 

The methods employed included the analysis of the district's 
complete available test results file for the Spring, 1988 administration of 
the California Achievement Test (Form E) in reading (n = 54,871) and mathe- 
matics (n = 17,722) in selected grade levels. These test results represent an 
annual district-wide testing effort at grades one through 11 in reading and at 
grades three through nine in mathematics. The reading test was administered 
to approximately 80 percent of the district's 68,000 pupil enrollment in 
grades one through 12 in 1988 (only results for grades one through 11 were 
utilized in this study). The enrolled population was 70 percent black, 23 
percent white, five percent Hispanic, and two percent other races in 1988. 
Similar testing rates were evident among the mathematics results obtained for 
this study. 

Normative Trend Data Project 

CTB/McGraw-Hill, the publisher of the California Achievement Test, 
invited all users to participate in its normative trend data (NTD) program by 
providing the publisher with their current test results data. CTB aggregated 
the available data (n = 95,000 to 160,000 per grade) using its original norming 
stratification design in order to produce 1988 NTD percentile and NCE tables 
for scoring and interpretation. Users were cautioned about the "updated" norms 
as follows. 

Sampling techniques varied between the tests' original standardization 
and the NTD. The respective representativeness was unknown. Grade 
level test usage varied; therefore, sample variations among grades were 
evident. Urban stratification cells appeared to be under-represented, thereby 
affecting the NTD samples. Finally, test familiarity among users affected the 
representativeness of the NTD sample (see Roudabush, 1989). 
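
The NCE (normal curve equivalent) tables mentioned above rest on a standard transformation of percentile ranks: the rank is converted to a normal deviate and rescaled so that ranks 1, 50, and 99 map to approximately 1, 50, and 99. The sketch below illustrates that general definition only; the function name is hypothetical, and it is not a reproduction of the publisher's actual NTD tables.

```python
from statistics import NormalDist


def percentile_rank_to_nce(percentile_rank: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to an NCE score.

    Uses the conventional definition NCE = 50 + 21.06 * z, where z is the
    standard normal deviate corresponding to the percentile rank.
    """
    if not 0 < percentile_rank < 100:
        raise ValueError("percentile rank must lie strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile_rank / 100)
    return 50 + 21.06 * z


if __name__ == "__main__":
    for rank in (1, 25, 50, 75, 99):
        print(rank, round(percentile_rank_to_nce(rank), 1))
```

Because the scale is equal-interval, NCE scores can be averaged across pupils, which percentile ranks cannot.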

The Cleveland School District Data 

Upon completion of the NTD project, CTB provided the school district 
with current (1988) national percentile rank values which corresponded to the 
original normed percentile ranks of the district's local quartile cut points. 
Tables 1 and 2 illustrate the respective 1986 (original norm) percentile rank 
and 1988 (NTD) percentile rank corresponding to the District's local 25th, 
50th and 75th percentile for reading and mathematics subtests in selected 
grades. For example, a full 75 percent of those students tested in grade 
three vocabulary scored at or below the 61st percentile rank ('86 norm). 
These students scored at or below the 59th percentile rank per NTD ('88) 
norms. 
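
The comparison described above can be sketched as a small lookup: each local quartile cut point carries two national percentile ranks, one under the original norms and one under the NTD norms, and the difference between them is the shift attributable to renorming. In this illustrative sketch only the grade three vocabulary pair for the local 75th percentile (61 and 59) comes from the text; the other rows, and the names `cut_points` and `rank_shift`, are hypothetical.

```python
# National percentile ranks of the local quartile cut points for one grade
# and subtest under each set of norms.
cut_points = {
    25: {"norm_86": 20, "ntd_88": 17},  # hypothetical values
    50: {"norm_86": 40, "ntd_88": 38},  # hypothetical values
    75: {"norm_86": 61, "ntd_88": 59},  # grade three vocabulary, from the text
}


def rank_shift(local_percentile: int) -> int:
    """Return the NTD rank minus the original-norm rank for a local cut point."""
    row = cut_points[local_percentile]
    return row["ntd_88"] - row["norm_86"]


if __name__ == "__main__":
    for local_percentile in sorted(cut_points):
        print(local_percentile, rank_shift(local_percentile))
```

A negative shift means the same local performance ranks lower against the more recent national sample.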

Additionally, data for each of the subtests per grade were 
aggregated and plotted. A graphic comparison of these results focused upon 
the median scores for reading and mathematics (see Appendix A). Distributions 
of each subtest at each grade were plotted with reference marks at the 50th 
percentile rank (national norm). The district's median scores (50th local 
percentile) in terms of nationally normed and NTD scores were then 
superimposed on the distributions. 

Table 1 

Cleveland City School District 

1988 CAT Reading Test 
Comparative Percentile Ranks 
Local, National, and NTD Quartiles 

       Local      Vocabulary           Comprehension        Total Reading 
Grade  %ile   Norm('86) NTD('88)   Norm('86) NTD('88)   Norm('86) NTD('88) 

Table 2 

Cleveland City School District 

1988 CAT Mathematics Test 
Comparative Percentile Ranks 
Local, National, and NTD Quartiles 

                  Computation          Concepts & Application 
Grade  Local   Norm('86) NTD('88)   Norm('86) NTD('88) 

Findings 

In the first set of comparisons, the District's local quartile performance 
in reading vocabulary was below nationally normed quartile levels in all 
cases (n = 33) compared in grades one through 11. 

Comparisons of the local quartile performances in reading 
comprehension indicated only five cases where the local lowest quartile met or 
exceeded the national norm quartile. All others (n=28) compared to a lower 
national normed percentile rank. 

The reading total local quartile performances were below the 
nationally normed quartile levels in all cases compared (n = 33). 

The second set of comparisons involved the local quartiles with the 
NTD quartile levels. Two types of observations emerged from this comparison. 
First, in all cases of vocabulary, reading comprehension, and total reading 
quartiles reviewed, local quartile performances remained below the NTD 
quartile levels. Secondly, in only 28 of the 99 comparisons made, NTD scores 
equalled (n = 13) or exceeded (n = 15) the original norm levels at each 
quartile. As a result, the district's original below-norm performance status, 
in general, regressed further when compared to NTD scores. 
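
The reading comparison counts follow from the design: three reading scores (vocabulary, comprehension, total) at 11 grades and three local quartiles give 3 x 11 x 3 = 99 comparisons, or 33 per score. The sketch below shows one way such a tally could be produced from pairs of original-norm and NTD percentile ranks. The function names and the sample pairs are hypothetical and do not reproduce the district's data.

```python
from collections import Counter

READING_SCORES = 3   # vocabulary, comprehension, total reading
READING_GRADES = 11  # grades one through 11
QUARTILES = 3        # local 25th, 50th, and 75th percentiles
reading_comparisons = READING_SCORES * READING_GRADES * QUARTILES  # 99


def classify(norm_86: int, ntd_88: int) -> str:
    """Label the NTD rank as lower than, equal to, or higher than the original."""
    if ntd_88 < norm_86:
        return "lower"
    if ntd_88 == norm_86:
        return "equal"
    return "higher"


def tally(pairs):
    """Count comparison outcomes over (original rank, NTD rank) pairs."""
    return Counter(classify(norm_86, ntd_88) for norm_86, ntd_88 in pairs)


if __name__ == "__main__":
    sample_pairs = [(61, 59), (40, 40), (22, 24), (35, 31)]  # hypothetical
    print(reading_comparisons, dict(tally(sample_pairs)))
```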

The results of comparisons in mathematics scores were somewhat 
different. First, in 21 of the 63 comparisons of mathematics subtests and 
total results, local quartile performances equalled or exceeded the nationally 
normed level of 1986. In all other cases (n = 42) the local quartiles were 
below national levels. 

Subsequent comparisons of local mathematics quartile performances 
with NTD scores indicated that only three of the 63 cases remained above the 
normed score while all others fell below. Also, in all comparisons, the NTD 
score quartile levels were lower than original norm scores. Again, a 
diminished pupil performance record in mathematics resulted from utilizing 
NTD scores. 

Score Distributions 

The findings of the graphic comparisons reiterate those mentioned 
above. Specifically, the district's median for its local distribution in 
reading fell below the nationally normed median in all cases compared. 
Additionally, in all but ten cases, the district's reading distribution median 
declined further when compared to the NTD scores. Four of the ten exceptions 
indicated no change while six evidenced one to three point improvements when 
referencing NTD scores. 

Similar comparisons in the area of mathematics find local medians 
at or above the national normed medians in the computation subtest and total 
test scores at three of the seven grade levels studied. When comparing the 
local median to NTD scores, only one grade level remained higher than the norm 
in computation and total test scores. In all cases, the median score declined 
when compared to NTD scores as opposed to the original norms. 

Conclusions 

The use of normative trend data to interpret local test scores 
resulted in a lowering of the measure of achievement levels in the district. 
With very few exceptions, reinterpreted (NTD) scores resulted in lower 
quartile levels for the District's pupils in reading and math. Additionally, 
score comparisons to "updated" NTD scores serve to positively skew the 
district's grade level distributions. In a district whose scores represent 
below normal status when compared to original test norms, NTD scores further 
skew the results. 
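
Positive skew here means that scores bunch at the low end of the percentile rank scale with a thin tail toward the high end. One conventional way to check the direction of skew is the moment coefficient of skewness, sketched below. The function name and both sample lists are hypothetical illustrations; the paper does not state how skew was assessed beyond the plotted distributions.

```python
def skewness(values):
    """Return the moment coefficient of skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    second_moment = sum((v - mean) ** 2 for v in values) / n
    third_moment = sum((v - mean) ** 3 for v in values) / n
    if second_moment == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return third_moment / second_moment ** 1.5


if __name__ == "__main__":
    # Hypothetical percentile ranks: one symmetric set, one bunched at the low end.
    symmetric_ranks = [10, 20, 30, 40, 50]
    low_heavy_ranks = [5, 8, 10, 12, 15, 18, 22, 30, 45, 70]
    print(skewness(symmetric_ranks), skewness(low_heavy_ranks) > 0)
```

A coefficient above zero indicates positive skew; the further ranks are pushed toward the bottom of the scale, the larger it tends to become.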



The general trend of lower ranking distributions when utilizing NTD 
scores supported contentions of the test publishers and critics. The annual 
performances on standardized reading and mathematics tests appeared to 
improve relative to the original standardization sample. The use of more 
recent norms serves to interpret annual results of test users relative to 
their peers; however, longitudinal comparisons must rely upon the original 
standardization. Additionally, it must be noted that NTD scores are compiled 
from a user sample and, therefore, do not reflect the same degree of 
representativeness evident in the original effort. The documented under-representation of 
the urban cells in the NTD sample may have contributed to the relative 
lowering of normed rankings over time. 

Despite the notion that annual norms adequately address the issues 
discussed in the literature, the interpretation and explanation of two sets of 
scores representing achievement testing in a district severely compromise 
their utility. 
APPENDIX A 

1988 CAT - Reading and Mathematics 
Test Score Distributions 
1988 CAT Reading Test Plots 



The plotted values represent aggregated frequencies of percentile 
rank scores for pupils tested in each grade. 



Legend: I - National Norm Median 

- Local Median per 1986 Norms 

- Local Median per 1988 NTD 

Note: Single solid line on the plot indicates '86 and NTD reference ranks are 
equal. 